Sparse reconstruction by convex relaxation: 
Fourier and Gaussian measurements 






Mark Rudelson 
Department of Mathematics 
University of Missouri, Columbia 

Columbia, Missouri 65211 
Email: rudelson@math.missouri.edu



Abstract: This paper proves the best known guarantees for exact
reconstruction of a sparse signal $f$ from few non-adaptive
universal linear measurements. We consider Fourier measurements
(a random sample of frequencies of $f$) and random Gaussian
measurements. The method for reconstruction that has recently
gained momentum in Sparse Approximation Theory is to
relax this highly non-convex problem to a convex problem, and
then solve it as a linear program. What the best guarantees are
for the reconstruction problem to be equivalent to its convex
relaxation is an open question. Recent work shows that the
number of measurements $k(r,n)$ needed to exactly reconstruct
any $r$-sparse signal $f$ of length $n$ from its linear measurements
with convex relaxation is usually $O(r\,\mathrm{polylog}(n))$. However,
known guarantees involve huge constants, in spite of the very good
performance of the algorithms in practice. In an attempt to reconcile
theory with practice, we prove the first guarantees for universal
measurements (i.e. which work for all sparse functions) with
reasonable constants. For Gaussian measurements,
$k(r,n) \le 11.7\, r\, [1.5 + \log(n/r)]$, which is optimal up to
constants. For Fourier measurements, we prove the best known bound
$k(r,n) = O(r \log(n) \cdot \log^2(r) \log(r \log n))$, which is
optimal within the $\log\log n$ and $\log^3 r$ factors. Our arguments
are based on the techniques of Geometric Functional Analysis and
Probability in Banach spaces.

I. Introduction 

During the last two years, Sparse Approximation Theory has
benefited from a rapid development of methods based on
Linear Programming. The idea was to relax a sparse recovery
problem to a convex optimization problem. The convex
problem can further be rendered as a linear program, and
analyzed with all available methods of Linear Programming.

Convex relaxation of sparse recovery problems can be traced
back in its rudimentary form to the mid-seventies; references to
its early history can be found in [26]. With the development
of fast methods of Linear Programming in the eighties, the 
idea of convex relaxation became truly promising. It was put 
forward most enthusiastically and successfully by Donoho 
and his collaborators since the late eighties, starting from the 
seminal paper [15] (see Theorem 8, attributed there to Logan, 
and Theorem 9). There is extensive work being carried out, 
both in theory and in practice, based on the convex relaxation 
[8], [14], [16], [17], [13], [19], [24], [25], [26], [11], [9], [10], 
[12], [2], [1], [4], [5], [23], [3], [6], [20]. 

To have theoretical guarantees for the convex relaxation
method, one needs to show that the sparse approximation
problem is equivalent to its convex relaxation. Proving this
presents a mathematical challenge. Known theoretical
guarantees work only for random measurements (e.g. random
Gaussian and Fourier measurements). Even when there is a
theoretical guarantee, it involves intractable or very large
constants, far worse than the performance observed in practice.

In this paper, we substantially improve the best known
theoretical guarantees for random Gaussian and Fourier (and
nonharmonic Fourier) measurements. For the first time, we are
able to prove guarantees with reasonable constants (although
only for Gaussian measurements). Our proofs are based on
methods of Geometric Functional Analysis. Such methods
were recently successfully used for related problems [23], [20].
As a result, our proofs are reasonably short (and hopefully,
transparent).

In Section II we state the sparse reconstruction problem and
describe the convex relaxation method. A guarantee of its
correctness is a very general restricted isometry condition on the
measurement ensemble, due to Candes and Tao ([5], see [3]).
Under this condition, the reconstruction problem with respect
to these measurements is equivalent to its convex relaxation.
In Sections III and IV we improve the best known guarantees
for the sparse reconstruction from random Fourier (and
nonharmonic Fourier) measurements and Gaussian measurements
(Theorems 3.1 and 4.1 respectively).



II. The Sparse Reconstruction Problem and its Convex Relaxation



We want to reconstruct an unknown signal $f \in \mathbb{C}^n$ from
linear measurements $\Phi f \in \mathbb{C}^k$, where $\Phi$ is some
known $k \times n$ matrix, called the measurement matrix. In the
interesting case $k < n$, the problem is underdetermined, and we are
interested in the sparsest solution. We can state this as the
optimization problem

$$\text{minimize } \|f^*\|_0 \quad \text{subject to } \Phi f^* = \Phi f, \tag{1}$$

where $\|f\|_0 = |\mathrm{supp}\, f|$ is the number of nonzero
coefficients of $f$. This problem is highly non-convex. So we will
consider its convex relaxation:

$$\text{minimize } \|f^*\|_1 \quad \text{subject to } \Phi f^* = \Phi f, \tag{2}$$



where $\|f\|_p$ denotes the $\ell_p$ norm throughout this paper,
$\|f\|_p = \big(\sum_{i=1}^n |f_i|^p\big)^{1/p}$. Problem (2) can be
classically reformulated as the linear program

$$\text{minimize } \sum_{i=1}^n t_i \quad \text{subject to } -t \le f^* \le t, \quad \Phi f^* = \Phi f,$$

which can be efficiently solved using general or special
methods of Linear Programming. Then the main question is:

Under what conditions on $\Phi$ are problems (1) and (2) equivalent?

In this paper, we will be interested in exact reconstruction,
i.e. we expect that the solutions to (1) and (2) are equal to each
other and to $f$. Results for approximate reconstruction can be
derived as consequences, see [4].
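As a rough illustration of the linear program above, the following sketch solves the relaxation (2) for a real signal with SciPy's general-purpose `linprog` solver. The helper name `l1_reconstruct`, the problem sizes, and the random seed are assumptions made for this example only; it is not the solver used in any of the cited work.

```python
import numpy as np
from scipy.optimize import linprog

def l1_reconstruct(Phi, b):
    """Solve: minimize sum(t) subject to -t <= f <= t, Phi f = b (real case)."""
    k, n = Phi.shape
    # Variables are stacked as z = (f, t), each block of length n.
    c = np.concatenate([np.zeros(n), np.ones(n)])
    I = np.eye(n)
    # f - t <= 0 and -f - t <= 0 together encode -t <= f <= t.
    A_ub = np.block([[I, -I], [-I, -I]])
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([Phi, np.zeros((k, n))])
    bounds = [(None, None)] * n + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]

# Hypothetical small instance: an r-sparse signal and Gaussian measurements.
rng = np.random.default_rng(0)
n, k, r = 40, 20, 3
f = np.zeros(n)
f[rng.choice(n, size=r, replace=False)] = rng.standard_normal(r)
Phi = rng.standard_normal((k, n))
f_hat = l1_reconstruct(Phi, Phi @ f)
print("l1 norms (solution, signal):", np.abs(f_hat).sum(), np.abs(f).sum())
print("max reconstruction error:", np.abs(f_hat - f).max())
```

Since $f$ itself is feasible, the returned solution never has larger $\ell_1$ norm than $f$; whether it coincides with $f$ is exactly the question studied in this paper.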

For exact reconstruction to be possible at all, one has to
assume that the signal $f$ is $r$-sparse, that is
$|\mathrm{supp}(f)| \le r$, and that the number of measurements
$k = k(r,n)$ is at least twice the sparsity $r$. Our goal will be to
find sufficient conditions (guarantees) for exact reconstruction.
The number of measurements $k(r,n)$ should be kept as small
as possible. Intuitively, the number of measurements should
be of the order of $r$, which is the 'true' dimension of $f$, rather
than the nominal dimension $n$.

Various results that appeared over the last two years
demonstrate that many natural measurement matrices $\Phi$ yield exact
reconstruction, with the number of measurements
$k(r,n) = O(r \cdot \mathrm{polylog}(n))$, see [2], [4], [5], [23].
In Sections III and IV we improve the best known estimates on $k$ for
Fourier (and, more generally, nonharmonic Fourier) and Gaussian
matrices respectively.

A general sufficient condition for exact reconstruction is
the restricted isometry condition on $\Phi$, due to Candes and
Tao ([5], see [3]). It roughly says that the matrix $\Phi$ acts as
an almost isometry on all $O(r)$-sparse vectors. Precisely, we
define the restricted isometry constant $\delta_r$ to be the smallest
positive number such that the inequality

$$C(1 - \delta_r)\|x\|_2^2 \le \|\Phi_T x\|_2^2 \le C(1 + \delta_r)\|x\|_2^2 \tag{3}$$

holds for some number $C > 0$ and for all $x$ and all subsets
$T \subset \{1, \dots, n\}$ of size $|T| \le r$, where $\Phi_T$ denotes
the $k \times |T|$ matrix that consists of the columns of $\Phi$
indexed by $T$. The following theorem is due to Candes and Tao
([5], see [3]).

Theorem 2.1 (Restricted Isometry Condition): Let $\Phi$ be a
measurement matrix whose restricted isometry constant satisfies

$$\delta_{3r} + 3\delta_{4r} < 2. \tag{4}$$

Let $f$ be an $r$-sparse signal. Then the solution to the linear
program (2) is unique and is equal to $f$.

This theorem says that under the restricted isometry
condition (4) on the measurement matrix $\Phi$, the reconstruction
problem (1) is equivalent to its convex relaxation (2) for all
$r$-sparse functions $f$.

A problem with the use of Theorem 2.1 is that the restricted
isometry condition (4) is usually difficult to check. Indeed, the
number of sets $T$ involved in this condition is exponential in $r$.
As a result, no explicit construction of a measurement matrix
is presently known that obeys the restricted isometry condition
(4). All known constructions of measurement matrices are
randomized.
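To make the cost of checking the restricted isometry condition concrete, here is a brute-force sketch that evaluates the constant in (3) for a tiny matrix by enumerating column subsets. The function name `restricted_isometry_constant` is hypothetical, and the scaling constant $C$ is passed in as a fixed parameter (an assumption; the definition allows any $C > 0$).

```python
import itertools
import math
import numpy as np

def restricted_isometry_constant(Phi, r, C):
    """Smallest delta with C(1-delta)|x|^2 <= |Phi_T x|^2 <= C(1+delta)|x|^2
    for all x and all column subsets T with |T| <= r, for a FIXED C > 0."""
    n = Phi.shape[1]
    delta = 0.0
    # Subsets of size exactly r suffice: by eigenvalue interlacing, the
    # spectrum of a principal submatrix of a Gram matrix lies between the
    # extreme eigenvalues of the full Gram matrix.
    for T in itertools.combinations(range(n), min(r, n)):
        cols = Phi[:, list(T)]
        gram = cols.conj().T @ cols
        eig = np.linalg.eigvalsh(gram)          # real, ascending order
        delta = max(delta, abs(eig[-1] / C - 1.0), abs(1.0 - eig[0] / C))
    return delta

# Hypothetical small Gaussian example; C = k matches E|Phi x|^2 = k |x|^2.
rng = np.random.default_rng(0)
k, n, r = 12, 8, 2
Phi = rng.standard_normal((k, n))
print("subsets checked:", math.comb(n, r))
print("delta_r with C = k:", restricted_isometry_constant(Phi, r, C=k))
```

The loop visits $\binom{n}{r}$ subsets, which is why this direct check is only feasible for very small $n$ and $r$.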

III. Reconstruction from Fourier measurements 

Our goal will be to reconstruct an $r$-sparse signal
$f \in \mathbb{C}^n$ from its discrete Fourier transform evaluated at
$k = k(r,n)$ points. These points will be chosen at random and
uniformly in $\{0, \dots, n-1\}$, forming a set $\Omega$.

The discrete Fourier transform $\hat f = \Psi f$ is defined by the
DFT matrix $\Psi$ with entries

$$\Psi_{\omega,t} = \frac{1}{\sqrt{n}} \exp(-i 2\pi \omega t / n), \qquad \omega, t \in \{0, \dots, n-1\}.$$

So, our measurement matrix $\Phi$ is the submatrix of $\Psi$
consisting of random rows (with indices in $\Omega$). To be able to
apply Theorem 2.1, it is enough to check that the restricted
isometry condition (4) holds for the random matrix $\Phi$ with
high probability. The problem is: what is the smallest number
of rows $k(r,n)$ of $\Phi$ for which this holds? With that number,
Theorem 2.1 immediately implies the following reconstruction
theorem for Fourier measurements:

Theorem 3.1 (Reconstruction from Fourier measurements):
A random set $\Omega \subset \{0, \dots, n-1\}$ of size $k(r,n)$
satisfies the following with high probability. Let $f$ be an
$r$-sparse signal in $\mathbb{C}^n$. Then $f$ can be exactly
reconstructed from the values of its Fourier transform on $\Omega$ as
a solution to the linear program

$$\text{minimize } \|f^*\|_1 \quad \text{subject to } \hat f^*(\omega) = \hat f(\omega), \quad \omega \in \Omega.$$
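The Fourier measurement ensemble can be sketched in a few lines of NumPy: build the normalized DFT matrix, draw a random frequency set, and keep the corresponding rows. The sizes, the seed, and the choice to sample frequencies without replacement are assumptions of this illustration.

```python
import numpy as np

n, k = 16, 6
rng = np.random.default_rng(1)

# Normalized DFT matrix: Psi[w, t] = exp(-2*pi*i*w*t/n) / sqrt(n).
omega, t = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
Psi = np.exp(-2j * np.pi * omega * t / n) / np.sqrt(n)

# Random set of k frequencies (assumption: drawn without replacement).
Omega = np.sort(rng.choice(n, size=k, replace=False))
Phi = Psi[Omega, :]                     # k x n measurement matrix

# A 2-sparse test signal and its measurements on Omega.
f = np.zeros(n, dtype=complex)
f[[2, 11]] = [1.0, -0.5j]
measurements = Phi @ f
print("frequencies:", Omega)
print("entry magnitude:", np.abs(Psi).max(), "vs 1/sqrt(n) =", 1 / np.sqrt(n))
```

The measurements agree with the corresponding entries of `np.fft.fft(f)` divided by $\sqrt{n}$, since NumPy's FFT uses the same sign convention without the normalization.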

The central remaining problem, what is the smallest value
of $k(r,n)$, is still open. The best known estimate is due to
Candes and Tao [4]:

$$k(r,n) = O(r \log^6 n). \tag{5}$$

The conjectured optimal estimate would be $O(r \log n)$, which
is known to hold for nonuniversal measurements, i.e. for one
sparse signal $f$ and for a random set $\Omega$ [2].

In this paper, we improve on the best known bound (5):

Theorem 3.2 (Sample size): Theorem 3.1 holds with

$$k(r,n) = O(r \log(n) \cdot \log^2(r) \log(r \log n)).$$

The dependence on $n$ is thus optimal within the $\log\log n$
factor and the dependence on $r$ is optimal within the $\log^3 r$
factor. So, our estimate is especially good for small $r$, but
since $r \le n$ it always yields $k(r,n) = O(r \log^4 n)$.
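For a feel of how the two asymptotic bounds compare, the sketch below evaluates the right-hand sides of (5) and of Theorem 3.2 with all implied constants set to 1. That normalization is an assumption made purely for illustration (the true constants are not specified here), so only the growth rates are meaningful.

```python
import math

def bound_candes_tao(r, n):
    """r * log^6(n), the shape of bound (5) with constant 1 (assumption)."""
    return r * math.log(n) ** 6

def bound_theorem_3_2(r, n):
    """r * log(n) * log^2(r) * log(r log n), constant 1 (assumption)."""
    return r * math.log(n) * math.log(r) ** 2 * math.log(r * math.log(n))

for r, n in [(10, 2**20), (100, 2**20), (1000, 2**20)]:
    print(r, n, f"{bound_candes_tao(r, n):.3g}", f"{bound_theorem_3_2(r, n):.3g}")
```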

Remark 3.3: Our results hold for transforms more general
than the discrete Fourier transform. One can replace the DFT
matrix $\Psi$ by any orthogonal matrix with entries of magnitude
$O(1/\sqrt{n})$. Theorems 3.1 and 3.2 hold for any such matrix.

In the remainder of this section, we prove Theorem 3.2. Let
$\Omega$ be a random subset of $\{0, \dots, n-1\}$ of size $k$. Recall
that the measurement matrix $\Phi$ consists of the rows of $\Psi$
whose indices are in $\Omega$. In view of Theorem 2.1, it suffices to
prove that the restricted isometry constant $\delta_r$ of $\Phi$
satisfies

$$\mathbb{E}\,\delta_r \le \varepsilon \tag{6}$$

whenever

$$k \ge C \Big(\frac{r \log n}{\varepsilon^2}\Big) \log\Big(\frac{r \log n}{\varepsilon^2}\Big) \log^2 r, \tag{7}$$

where $\varepsilon > 0$ is arbitrary, and $C$ is some absolute constant.

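Let $y_0, \dots, y_{n-1}$ denote the rows of the matrix $\Psi$.
Dualizing, we see that (6) is equivalent to the following inequality:

$$\mathbb{E} \sup_{|T| \le r} \Big\| \mathrm{id}_{\mathbb{C}^T} - \sum_{i \in \Omega} (\bar C y_i^T) \otimes (\bar C y_i^T) \Big\| \le \varepsilon$$

with $\bar C = 1/\sqrt{C}$, where $C$ is the constant from (3). Here
and thereafter, for vectors $x, y \in \mathbb{C}^n$ the tensor
$x \otimes y$ is the rank-one linear operator given by
$(x \otimes y)(z) = \langle x, z \rangle y$, where
$\langle \cdot, \cdot \rangle$ is the canonical inner product
on $\mathbb{C}^n$. The notation $x^T$ stands for the restriction of a
vector $x$ to its coordinates in the set $T$. The operator
$\mathrm{id}_{\mathbb{C}^T}$ in this inequality is the identity on
$\mathbb{C}^T$, and the norm is the operator norm for operators on
$\ell_2^T$.

The orthogonality of $\Psi$ can be expressed as
$\mathrm{id}_{\mathbb{C}^n} = \sum_{i=0}^{n-1} y_i \otimes y_i$. We
shall re-normalize the vectors $y_i$, letting
$x_i = \sqrt{n}\, y_{i-1}$. Now we have $\|x_i\|_\infty = O(1)$ for
all $i$. The proof has now reduced to the following probabilistic
statement, which we interpret as a law of large numbers for random
operators.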

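Theorem 3.4 (Uniform Operator Law of Large Numbers):
Let $x_1, \dots, x_n$ be vectors in $\mathbb{C}^n$ with uniformly
bounded entries: $\|x_i\|_\infty \le K$ for all $i$. Assume that
$\mathrm{id}_{\mathbb{C}^n} = \frac{1}{n} \sum_{i=1}^n x_i \otimes x_i$.
Let $\Omega$ be a random subset of $\{1, \dots, n\}$ of size $k$. Then

$$\mathbb{E} \sup_{|T| \le r} \Big\| \mathrm{id}_{\mathbb{C}^T} - \frac{1}{k} \sum_{i \in \Omega} x_i^T \otimes x_i^T \Big\| \le \varepsilon \tag{8}$$

provided $k$ satisfies (7) (with constant $C$ that may depend on
$K$).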

Theorem 3.4 is proved by the techniques developed in
Probability in Banach spaces. The general roadmap is similar
to [21], [22]. We first observe that

$$\mathbb{E}\, \frac{1}{k} \sum_{i \in \Omega} x_i^T \otimes x_i^T = \frac{1}{n} \sum_{i=1}^n x_i^T \otimes x_i^T = \mathrm{id}_{\mathbb{C}^T},$$

so the random operator whose norm we estimate in (8) has
mean zero. Then the standard symmetrization (see [27] Lemma
6.3) implies that the left-hand side of (8) does not exceed

$$2\, \mathbb{E} \sup_{|T| \le r} \Big\| \frac{1}{k} \sum_{i \in \Omega} \varepsilon_i\, x_i^T \otimes x_i^T \Big\|,$$

where $(\varepsilon_i)$ are independent symmetric $\{-1, 1\}$-valued
random variables, also (jointly) independent of $\Omega$. Then the
conclusion of Theorem 3.4 will be easily deduced from the following
lemma.



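Lemma 3.5: Let $x_1, \dots, x_k$, $k \le n$, be vectors in
$\mathbb{C}^n$ with uniformly bounded entries,
$\|x_i\|_\infty \le K$ for all $i$. Then

$$\mathbb{E} \sup_{|T| \le r} \Big\| \sum_{i=1}^k \varepsilon_i\, x_i^T \otimes x_i^T \Big\| \le k_1 \sup_{|T| \le r} \Big\| \sum_{i=1}^k x_i^T \otimes x_i^T \Big\|^{1/2}, \tag{9}$$

where $k_1 \le C_1(K) \sqrt{r}\, \log(r) \sqrt{\log n} \sqrt{\log k}$.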



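Let us show how Lemma 3.5 implies Theorem 3.4. We first
condition on a choice of $\Omega$ and apply Lemma 3.5 for $x_i$,
$i \in \Omega$. Then we take the expectation with respect to
$\Omega$. We then use a consequence of the Hölder inequality,
$\mathbb{E}(|X|^{1/2}) \le (\mathbb{E}|X|)^{1/2}$, and the triangle
inequality. Let us denote the left hand side of (8) by $E$. We
obtain:

$$E \le \frac{2 k_1}{\sqrt{k}}\, \mathbb{E} \sup_{|T| \le r} \Big\| \frac{1}{k} \sum_{i \in \Omega} x_i^T \otimes x_i^T \Big\|^{1/2} \le \frac{2 k_1}{\sqrt{k}} (E + 1)^{1/2}.$$

It follows that $E \le C_2\, k_1 / \sqrt{k}$, provided that
$k_1 / \sqrt{k} = O(1)$. Theorem 3.4 now follows from our choice of
$k = k(r,n)$.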



Hence it is only left to prove Lemma 3.5. Throughout the
proof, $B_p^n$ and $B_p^T$ denote the unit balls of the norm
$\|\cdot\|_p$ on $\mathbb{C}^n$ and $\mathbb{C}^T$ respectively. To
this end, we first replace Bernoulli r.v.'s by standard
independent normal random variables $g_i$, using a comparison
principle (inequality (4.8) in [27]). Then our problem becomes
to bound the Gaussian process indexed by the union of the
unit Euclidean balls $B_2^T$ in $\mathbb{C}^T$ for all subsets $T$ of
$\{1, \dots, n\}$ of size at most $r$. We apply Dudley's inequality
(Theorem 11.17 in [27]), which is a general upper bound on Gaussian
processes. Let us denote the left hand side of (9) by $E_1$. We
obtain:

$$E_1 \le C_3\, \mathbb{E} \sup_{|T| \le r} \Big\| \sum_{i=1}^k g_i\, x_i^T \otimes x_i^T \Big\|
= C_3\, \mathbb{E} \sup_{|T| \le r,\ x \in B_2^T} \Big| \sum_{i=1}^k g_i\, |\langle x_i, x \rangle|^2 \Big|
\le C_4 \int_0^\infty \log^{1/2} N\Big( \bigcup_{|T| \le r} B_2^T,\ \delta,\ u \Big)\, du,$$

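where $N(Z, \delta, u)$ denotes the minimal number of balls of
radius $u$ in the metric $\delta$ centered in points of $Z$, needed to
cover the set $Z$. The metric $\delta$ in Dudley's inequality is
defined by the Gaussian process, and in our case it is

$$\delta(x, y) = \Big( \sum_{i=1}^k \big( |\langle x_i, x \rangle|^2 - |\langle x_i, y \rangle|^2 \big)^2 \Big)^{1/2}
\le \Big( \sum_{i=1}^k \big( |\langle x_i, x \rangle| + |\langle x_i, y \rangle| \big)^2 \Big)^{1/2} \max_{i \le k} |\langle x_i, x - y \rangle|$$

$$\le 2 \max_{|T| \le r,\ z \in B_2^T} \Big( \sum_{i=1}^k |\langle x_i, z \rangle|^2 \Big)^{1/2} \max_{i \le k} |\langle x_i, x - y \rangle|
= 2R \max_{i \le k} |\langle x_i, x - y \rangle|,$$

where

$$R := \sup_{|T| \le r} \Big\| \sum_{i=1}^k x_i^T \otimes x_i^T \Big\|^{1/2}.$$

Hence

$$E_1 \le C_5 R \sqrt{r} \int_0^\infty \log^{1/2} N\Big( \frac{1}{\sqrt{r}} D_2^{r,n},\ \|\cdot\|_X,\ u \Big)\, du. \tag{10}$$

Here

$$D_p^{r,n} = \bigcup_{|T| \le r} B_p^T, \qquad \|x\|_X = \max_{i \le k} |\langle x_i, x \rangle|.$$

We will use the containments

$$\frac{1}{\sqrt{r}} D_2^{r,n} \subset D_1^{r,n} \subset K B_X, \qquad D_1^{r,n} \subset B_1^n, \tag{11}$$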



where $B_X$ denotes the unit ball of the norm $\|\cdot\|_X$. The
second containment follows from the uniform boundedness of $(x_i)$.
We can thus replace $\frac{1}{\sqrt{r}} D_2^{r,n}$ in (10) by
$D_1^{r,n}$. Comparing (10) to the right hand side of (9) we see
that, in order to complete the proof of Lemma 3.5, it suffices to
show that

$$\int_0^K \log^{1/2} N\big( D_1^{r,n},\ \|\cdot\|_X,\ u \big)\, du \le C_6 \log(r) \sqrt{\log n} \sqrt{\log k}, \tag{12}$$

with $C_6 = C_6(K)$. To this end, we will estimate the covering
numbers in this integral in two different ways. For big $u$, we
will just use the last containment in (11), which allows us
to replace $D_1^{r,n}$ by $B_1^n$.

Lemma 3.6: Let $x_1, \dots, x_k$, $k \le n$, be vectors as in Lemma
3.5. Then for all $u > 0$ we have

$$N(B_1^n, \|\cdot\|_X, u) \le (2n)^m,$$

where $m = C_7 K^2 \log(k) / u^2$.

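Proof: We use the empirical method of Maurey. Fix a
vector $y \in B_1^n$. Define a random vector $Z \in \mathbb{R}^n$ that
takes values $(0, \dots, 0, \mathrm{sign}(y(i)), 0, \dots, 0)$ with
probability $|y(i)|$ each, $i = 1, \dots, n$ (all entries of that
vector are zero except the $i$-th), and the value $0$ with the
remaining probability $1 - \|y\|_1$. Here
$\mathrm{sign}(z) = z/|z|$ whenever $z \ne 0$, and $0$ otherwise.
Note that $\mathbb{E} Z = y$. Let $Z_1, \dots, Z_m$ be independent
copies of $Z$. Using symmetrization as before, we see that

$$E_3 := \mathbb{E} \Big\| y - \frac{1}{m} \sum_{j=1}^m Z_j \Big\|_X \le \frac{2}{m}\, \mathbb{E} \Big\| \sum_{j=1}^m \varepsilon_j Z_j \Big\|_X.$$

Now we condition on a choice of $(Z_j)$ and take the expectation
with respect to the random signs $(\varepsilon_j)$. Using comparison
to Gaussian variables as before, we obtain

$$E_4 := \mathbb{E} \Big\| \frac{1}{\sqrt{m}} \sum_{j=1}^m \varepsilon_j Z_j \Big\|_X
\le C_7\, \mathbb{E} \Big\| \frac{1}{\sqrt{m}} \sum_{j=1}^m g_j Z_j \Big\|_X
= C_7\, \mathbb{E} \max_{i \le k} \Big| \frac{1}{\sqrt{m}} \sum_{j=1}^m g_j \langle Z_j, x_i \rangle \Big|.$$

For each $i$, $\gamma_i := \frac{1}{\sqrt{m}} \sum_{j=1}^m g_j \langle Z_j, x_i \rangle$
is a Gaussian random variable with zero mean and with variance

$$\sigma_i^2 = \frac{1}{m} \sum_{j=1}^m |\langle Z_j, x_i \rangle|^2 \le K^2,$$

since $|\langle Z_j, x_i \rangle| \le \|x_i\|_\infty \le K$. Using a
simple bound on the maximum of Gaussian random variables (see (3.13)
in [27]), we obtain

$$E_4 \le C_7\, \mathbb{E} \max_{i \le k} |\gamma_i| \le C_8 \sqrt{\log k}\, \max_{i \le k} \sigma_i \le C_8 \sqrt{\log k}\, K.$$

Taking the expectation with respect to $(Z_j)$ we obtain

$$E_3 \le \frac{2}{\sqrt{m}}\, \mathbb{E}(E_4) \le \frac{2 C_8 K \sqrt{\log k}}{\sqrt{m}}.$$

With the choice of $m$ made in the statement of the lemma, we
conclude that $E_3 \le u$. We have shown that for every
$y \in B_1^n$, there exists a $z \in \mathbb{C}^n$ of the form
$z = \frac{1}{m} \sum_{j=1}^m Z_j$ such that $\|y - z\|_X \le u$.
Each $Z_j$ takes $2n$ values, so $z$ takes $(2n)^m$ values. Hence
$B_1^n$ can be covered by at most $(2n)^m$ balls of the norm
$\|\cdot\|_X$ of radius $u$. A standard argument shows that we can
assume that these balls are centered in points of $B_1^n$. This
completes the proof of Lemma 3.6. ■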
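The empirical method used in this proof is easy to simulate. The sketch below draws $m$ independent copies of $Z$ for a fixed $y \in B_1^n$, averages them, and reports $\|y - z\|_X$ for a few values of $m$. The function name `maurey_approximation`, the random-sign vectors $x_i$ (so $K = 1$), and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def maurey_approximation(y, m, rng):
    """Average of m independent copies of Z, where Z = sign(y_i) e_i with
    probability |y_i| and Z = 0 with the remaining probability 1 - |y|_1."""
    n = y.size
    probs = np.append(np.abs(y), max(0.0, 1.0 - np.abs(y).sum()))
    idx = rng.choice(n + 1, size=m, p=probs / probs.sum())
    z = np.zeros(n)
    for i in idx[idx < n]:          # index n encodes the outcome Z = 0
        z[i] += np.sign(y[i])
    return z / m

rng = np.random.default_rng(2)
n, k = 64, 16
X = rng.choice([-1.0, 1.0], size=(k, n))   # rows x_i with |x_i|_inf <= K = 1
y = rng.standard_normal(n)
y /= np.abs(y).sum()                        # a point of B_1^n
for m in (10, 100, 1000):
    z = maurey_approximation(y, m, rng)
    # |y - z|_X = max_i |<x_i, y - z>|
    print(m, np.abs(X @ (y - z)).max())
```

Each approximant $z$ is an average of $m$ signed coordinate vectors (or zeros), which is the finite set of at most $(2n)^m$ candidates counted in the lemma.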

For small $u$, we will use a simple volumetric estimate.
The diameter of $B_1^T$ considered as a set in $\mathbb{C}^n$ is at
most $K$ with respect to the norm $\|\cdot\|_X$ (this follows from
the containments (11)). It follows that
$N(B_1^T, \|\cdot\|_X, u) \le (1 + 2K/u)^r$ for all $u > 0$, see (5.7)
in [Pi]. The set $D_1^{r,n}$ consists of
$d(r,n) = \sum_{j=1}^r \binom{n}{j}$ balls of the form $B_1^T$, thus

$$N(D_1^{r,n}, \|\cdot\|_X, u) \le d(r,n)\, (1 + 2K/u)^r. \tag{13}$$



Now we combine the estimate of the covering number
$N(u) = \log^{1/2} N(D_1^{r,n}, \|\cdot\|_X, u)$ of Lemma 3.6, and the
volumetric estimate (13), to bound the integral in (12). Using
Stirling's approximation, we see that $d(r,n) \le (C_9 n / r)^r$.
Thus

$$N(u) \le C_{10} \sqrt{r} \big[ \sqrt{\log(n/r)} + \sqrt{\log(1 + 2/u)} \big] =: N_1(u),$$

$$N(u) \le \frac{C_{10}}{u} \sqrt{\log k} \sqrt{\log n} =: N_2(u),$$

where $C_{10} = C_{10}(K)$. Then we bound the integral in (12) as

$$\int_0^K N(u)\, du \le \int_0^A N_1(u)\, du + \int_A^K N_2(u)\, du
\le C_{11} A \sqrt{r} \big[ \sqrt{\log(n/r)} + \sqrt{\log(1 + 2/A)} \big] + C_{11} \log(1/A) \sqrt{\log k} \sqrt{\log n},$$

where $C_{11} = C_{11}(K)$. Choosing $A = 1/\sqrt{r}$, we conclude
that the integral in (12) is at most a constant (depending on $K$)
times $\sqrt{\log(n/r)} + \sqrt{\log r} + \log(r) \sqrt{\log k} \sqrt{\log n}$.
This proves (12), which completes the proof of Lemma 3.5 and thus
of Theorems 3.4 and 3.2. ■



IV. Reconstruction from Gaussian measurements 

Our goal will be to reconstruct an $r$-sparse signal
$f \in \mathbb{R}^n$ from $k = k(r,n)$ Gaussian measurements. These
are given by $\Phi f \in \mathbb{R}^k$, where $\Phi$ is a $k \times n$
random matrix ('Gaussian matrix' in the sequel), whose entries are
independent $N(0,1)$ random variables. The reconstruction will be
achieved by solving the linear program (2).

The problem again is to find the smallest number of
measurements $k(r,n)$ for which, with high probability, we have
an exact reconstruction of every $r$-sparse signal $f$ from its
measurements $\Phi f$. It has recently been shown in [5], [23],
[3] that

$$k(r,n) = O(r \log(n/r)), \tag{14}$$

and this was extended in [20] to sub-gaussian measurements.
This is asymptotically optimal. However, the constant factor
implicit in (14) has not been known; previous proofs of
(14) yield unreasonably weak constants (of order 2,000 and
higher). In fact, no theoretical guarantees with reasonable
constants have been known for Linear Programming
based reconstructions. So, there is presently a gap between
theoretical guarantees and the good practical performance of
reconstruction (2) (see e.g. [3]). Here we shall prove a first
practically reasonable guarantee of the form (14):

$$k(r,n) \le c_1 r \big[ c_2 + \log(n/r) \big] (1 + o(1)), \tag{15}$$
$$c_1 = 6 + 4\sqrt{2} \approx 11.66, \qquad c_2 = 1.5.$$

Theorem 4.1 (Reconstruction from Gaussian measurements):
A $k \times n$ Gaussian matrix $\Phi$ with $k > k(r,n)$ satisfies the
following with probability

$$1 - 3.5 \exp\Big( -\big( \sqrt{k} - \sqrt{k(r,n)} \big)^2 / 18 \Big).$$

Let $f$ be an $r$-sparse signal in $\mathbb{R}^n$. Then $f$ can be
exactly reconstructed from the measurements $\Phi f$ as a unique
solution to the linear program (2).
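To get a sense of the numbers in (15) and in the probability bound of Theorem 4.1, the following sketch evaluates both for sample parameters. Dropping the $(1 + o(1))$ factor is an assumption made for illustration, and the function names `k_bound` and `success_probability` are hypothetical.

```python
import math

C1 = 6 + 4 * math.sqrt(2)   # equals 2 * (sqrt(2) + 1)^2, about 11.66
C2 = 1.5

def k_bound(r, n):
    """Right-hand side of (15) with the (1 + o(1)) factor dropped (assumption)."""
    return C1 * r * (C2 + math.log(n / r))

def success_probability(k, k_rn):
    """Lower bound from Theorem 4.1, meaningful for k > k(r, n)."""
    return 1 - 3.5 * math.exp(-(math.sqrt(k) - math.sqrt(k_rn)) ** 2 / 18)

r, n = 10, 10**4
k_rn = k_bound(r, n)
print("c1 =", C1)
print("bound on k(r, n):", k_rn)
for k in (1200, 1500, 2000):
    print(k, success_probability(k, k_rn))
```

The bound is only informative once $k$ exceeds $k(r,n)$ by enough that the exponential term drops below $1/3.5$.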

Our proof of Theorem 4.1 is direct; we will not use
the Restricted Isometry Theorem 2.1. The first part of this
argument follows a general method of [20]. One interprets the
exact reconstruction as the fact that the (random) kernel of $\Phi$
misses the cone generated by the (shifted) ball of $\ell_1$. Then
one embeds the cone in a universal set $D$, which is easier to
handle, and proves that the random subspace does not intersect
$D$. However, to obtain good constants as in (15), we will need
to (a) improve the constant of embedding into $D$ from [20],
and (b) use Gordon's Escape Through the Mesh Theorem [18],
which is tight in terms of constants. In Gordon's theorem, one
measures the size of a set $S$ in $\mathbb{R}^n$ by its Gaussian width

$$w(S) = \mathbb{E} \sup_{x \in S} \langle g, x \rangle,$$

where $g$ is a random vector in $\mathbb{R}^n$ whose components are
independent $N(0,1)$ random variables (Gaussian vector). The
following is Gordon's theorem [18].



Theorem 4.2 (Escape Through the Mesh (Gordon)): Let $S$
be a subset of the unit Euclidean sphere $S^{n-1}$ in $\mathbb{R}^n$.
Let $Y$ be a random $(n-k)$-dimensional subspace of $\mathbb{R}^n$,
distributed uniformly in the Grassmanian with respect to the Haar
measure. Assume that $w(S) < \sqrt{k}$. Then $Y \cap S = \emptyset$
with probability at least

$$1 - 3.5 \exp\Big( -\big( k/\sqrt{k+1} - w(S) \big)^2 / 18 \Big).$$

We will now prove Theorem 4.1. First note that the function
$f$ is the unique solution of (2) if and only if $0$ is the unique
solution of the problem

$$\text{minimize } \|f - g^*\|_1 \quad \text{subject to } g^* \in \mathrm{Ker}(\Phi) =: Y. \tag{16}$$

$Y$ is an $(n-k)$-dimensional subspace of $\mathbb{R}^n$. Due to the
rotation invariance of the Gaussian random vectors, $Y$ is
distributed uniformly in the Grassmanian $G_{n-k,n}$ of
$(n-k)$-dimensional subspaces of $\mathbb{R}^n$, with respect to the
Haar measure.

Now, $0$ is the unique solution to (16) if and only if $0$ is
the unique metric projection of $f$ onto the subspace $Y$ in the
norm $\|\cdot\|_1$. This in turn is equivalent to the fact that $0$
is the unique contact point between the subspace $Y$ and the ball of
the norm $\|\cdot\|_1$ centered at $f$:

$$(f + \|f\|_1 B_1^n) \cap Y = \{0\}. \tag{17}$$

(Recall that $B_p^n$ is the unit ball of the norm $\|\cdot\|_p$.) Let
$C_f$ be the cone in $\mathbb{R}^n$ generated by the set
$f + \|f\|_1 B_1^n$ (the cone of a set $A \subset \mathbb{R}^n$ is
defined as $\{ta \mid a \in A,\ t \in \mathbb{R}_+\}$). Then the
statement that (17) holds for all $r$-sparse functions $f$ is clearly
equivalent to

$$C_f \cap Y = \{0\} \quad \text{for all $r$-sparse functions } f. \tag{18}$$

We can represent the cone $C_f$ as follows. Let

$$T^+ = \{i \mid f(i) > 0\}, \qquad T^- = \{i \mid f(i) < 0\}, \qquad T = T^+ \cup T^-.$$

Then

$$C_f = \Big\{ t \in \mathbb{R}^n \ \Big|\ \sum_{i \in T^-} t(i) - \sum_{i \in T^+} t(i) + \sum_{i \in T^c} |t(i)| \le 0 \Big\}.$$

We will now bound the cone $C_f$ by a universal set, which does
not depend on $f$.
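The description of the cone $C_f$ translates directly into a membership test. The sketch below, with a hypothetical helper `in_cone` and a made-up example signal, evaluates the defining linear expression for a few directions.

```python
import numpy as np

def in_cone(f, t, tol=1e-12):
    """Check sum_{T-} t(i) - sum_{T+} t(i) + sum_{T^c} |t(i)| <= 0."""
    pos, neg, off = f > 0, f < 0, f == 0
    value = t[neg].sum() - t[pos].sum() + np.abs(t[off]).sum()
    return bool(value <= tol)

# Made-up 2-sparse signal: T+ = {0}, T- = {2}, T^c = {1, 3}.
f = np.array([2.0, 0.0, -1.0, 0.0])
print(in_cone(f, f))                                  # towards the centre of the ball
print(in_cone(f, -f))                                 # away from the ball
print(in_cone(f, np.array([1.0, 0.5, 0.0, 0.0])))
print(in_cone(f, np.array([0.0, 1.0, 0.0, 0.0])))
```

A direction $t$ belongs to $C_f$ exactly when a small step from $0$ along $t$ stays inside the ball $f + \|f\|_1 B_1^n$, which is how the defining expression arises.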

Lemma 4.3: Consider the spherical part of the cone,
$K_f = C_f \cap S^{n-1}$. Then $K_f \subset (\sqrt{2} + 1) D$, where

$$D = \mathrm{conv}\{x \in S^{n-1} \mid |\mathrm{supp}(x)| \le r\}.$$

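Proof: Fix a point $x \in C_f \cap S^{n-1}$. By the representation
of $C_f$ above, the Cauchy-Schwarz inequality and $|T| \le r$, we have

$$\sum_{i \in T^c} |x(i)| \le \sum_{i \in T} |x(i)| \le \sqrt{r} \Big( \sum_{i \in T} |x(i)|^2 \Big)^{1/2} \le \sqrt{r}.$$

The norm $\|\cdot\|_D$ on $\mathbb{R}^n$ whose unit ball is $D$ can be
computed as

$$\|x\|_D = \sum_{l=1}^{L} \Big( \sum_{i \in I_l} |x(i)^*|^2 \Big)^{1/2},$$

where $L = \lceil n/r \rceil$, $I_l = \{r(l-1)+1, \dots, rl\}$ for
$l < L$, $I_L = \{r(L-1)+1, \dots, n\}$, and $(x(i)^*)$ is a
non-increasing rearrangement of the sequence $(|x(i)|)$.

Set $F = F(x) = \{i \mid |x(i)| \ge 1/\sqrt{r}\}$. Since
$x \in S^{n-1}$, we have $|F| \le r$. Hence, for any $x \in K_f$ there
exists a set $E = E(x) \subset \{1, \dots, n\}$, which consists of
$2r$ elements and such that $E \supset F \cup T$. Therefore, $x$ can
be represented as $x = x' + x''$ so that $\mathrm{supp}(x') \subset E$,
$\|x'\|_2 \le 1$, $\mathrm{supp}(x'') \subset E^c$,
$\|x''\|_\infty \le 1/\sqrt{r}$, $\|x''\|_1 \le \sqrt{r}$. Set

$$V_E = \big\{ x = x' + x'' \ \big|\ \mathrm{supp}(x') \subset E,\ \|x'\|_2 \le 1,\ \mathrm{supp}(x'') \subset E^c,\ \|x''\|_\infty \le 1/\sqrt{r},\ \|x''\|_1 \le \sqrt{r} \big\}.$$

Then the above argument shows that
$K_f \subset \bigcup_{|E| = 2r} V_E =: W$.

The maximum of $\|x\|_D$ over $x \in W$ is attained at the
extreme points of the sets $V_E$, which have the form
$x = x' + x''$, where $x' \in S^E$ (the unit Euclidean sphere of
$\mathbb{R}^E$), and $x''$ has coordinates $0$ and $\pm 1/\sqrt{r}$
with $r$ non-zero coordinates. Notice that since
$|\mathrm{supp}(x')| \le 2r$, $\|x'\|_D \le \sqrt{2}\, \|x'\|_2$.
Thus, for any extreme point $x$ of $V_E$,

$$\|x\|_D \le \|x'\|_D + \|x''\|_D \le \sqrt{2}\, \|x'\|_2 + \|x''\|_2 \le \sqrt{2} + 1.$$

The second inequality follows from $|\mathrm{supp}(x')| \le 2r$ and
$|\mathrm{supp}(x'')| = r$. This completes the proof of the lemma. ■

To use Gordon's escape through the mesh theorem, we have
to estimate the Gaussian width of $D$.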

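Lemma 4.4:

$$w(D) \le \sqrt{2r \log(e^{3/2} n / r)}\ (1 + o(1)).$$

Proof: By definition,

$$w(D) = \mathbb{E} \sup_{|J| = r} \Big( \sum_{i \in J} |g(i)|^2 \Big)^{1/2}.$$

Let $p > 1$ be a number to be chosen later. By Hölder's
inequality, we have

$$w(D) \le \Big( \mathbb{E} \sum_{|J| = r} \Big( \sum_{i \in J} |g(i)|^2 \Big)^{p/2} \Big)^{1/p}
= \Big( \binom{n}{r}\, \mathbb{E} \Big( \sum_{i=1}^r |g(i)|^2 \Big)^{p/2} \Big)^{1/p}
\le \Big( \frac{en}{r} \Big)^{r/p} \Big( \frac{2^{p/2}\, \Gamma(p/2 + r/2)}{\Gamma(r/2)} \Big)^{1/p}.$$

By Stirling's formula,
$w(D) \le \big( \frac{en}{r} \big)^{r/p} \big( \frac{p + r}{e} \big)^{1/2} (1 + o(1))$.
Now set $p = 2r \log(en/r)$, so that $(en/r)^{r/p} = e^{1/2}$. Then

$$w(D) \le (p + r)^{1/2} (1 + o(1)) = \sqrt{2r \log\Big( \frac{e^{3/2} n}{r} \Big)}\ (1 + o(1)).$$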
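The Gaussian width of $D$ can also be estimated numerically, since the supremum over $D$ equals the Euclidean norm of the $r$ largest-magnitude coordinates of $g$. The following Monte Carlo sketch compares such an estimate with the bound of Lemma 4.4 without the $(1 + o(1))$ factor; the function names, sample sizes and seed are assumptions for illustration.

```python
import numpy as np

def gaussian_width_D(n, r, trials, rng):
    """Monte Carlo estimate of w(D) = E max_{|J|=r} (sum_{i in J} g(i)^2)^{1/2}."""
    g2 = rng.standard_normal((trials, n)) ** 2
    # After partitioning at index n - r, the last r entries of each row
    # are the r largest squared coordinates.
    top = np.partition(g2, n - r, axis=1)[:, n - r:]
    return float(np.sqrt(top.sum(axis=1)).mean())

def width_bound(n, r):
    """sqrt(2 r log(e^{3/2} n / r)): Lemma 4.4 without the (1 + o(1)) factor."""
    return float(np.sqrt(2 * r * (1.5 + np.log(n / r))))

rng = np.random.default_rng(4)
for n, r in [(200, 5), (1000, 10), (1000, 50)]:
    print(n, r, gaussian_width_D(n, r, 2000, rng), width_bound(n, r))
```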



To deduce (18) we define $S = \bigcup_f K_f$, where the union is
over all $r$-sparse functions $f$. Then (18) is equivalent to

$$S \cap Y = \emptyset. \tag{19}$$

Lemma 4.3 implies that $S \subset (\sqrt{2} + 1) D$. Then by Lemma 4.4,

$$w(S) \le (\sqrt{2} + 1)\, w(D) \le (1 - o(1)) \sqrt{k(r,n)}.$$

Then (19) follows from Gordon's Theorem 4.2. This completes the
proof of Theorem 4.1. ■
Acknowledgement. After this paper was announced,
A. Pajor pointed out that Lemma 3.6 was proved by B. Carl in